
All Questions

0 votes
1 answer
75 views

The complexity order of regret (especially in online reinforcement learning)?

In online reinforcement learning theory, how does one determine the order of a regret bound when it contains two or more terms? For example, the state space is $X$, the action space is $A$, the episode ...
white
